AUTOMATIC CREATION OF A CODE GENERATOR
FROM A MACHINE DESCRIPTION* 



Abstract 

This paper studies some of the problems involved in
attaining machine independence for a code generator, similar 
to the language independence and the token independence at- 
tained by automatic parsing and automatic lexical systems. 
In particular, the paper examines the logic involved in two 
areas of code generation: computation and data reference. 
It presents models embodying the logic of each area and 
demonstrates how the models can be filled out by descrip- 
tive information about a particular machine. The paper 
also describes how the models can be incorporated into a 
descriptive macro code generating system (DMACS) to be
used as a tool by a language implementer in creating a 
machine independent code generator, which can be made 
machine-directed by a suitable description of a particu- 
lar machine. 



*This report reproduces a thesis of the same title submitted 
to the Department of Electrical Engineering, Massachusetts 
Institute of Technology, in partial fulfillment of the re- 
quirements for the degree of Electrical Engineer, March 1971. 
CHAPTER I



1.1 INTRODUCTION 



The process of translating a high level language into 
machine instructions is traditionally divided into three 
distinct problems: lexical analysis, syntactic analysis, and 
code generation. The flow of data in such a translator is 
outlined in Figure 1.1. 

     Source       LEXICAL       SYNTACTIC       CODE           Machine
     Program  ->  ANALYSIS  ->  ANALYSIS   ->   GENERATION ->  Code

Figure 1.1: Simple Diagram of a Compiler

The lexical analyzer accepts a string of characters and 
groups these into identifiers and operators, etc., thus 
creating a string of lexical 'tokens'. The parser analyzes 
the underlying syntactic structure of this string of 
tokens, outputting either a sequence of macro operations or 
a parse tree. The code generator then translates the macros 
(or the parse tree structure) into machine instructions for 
a particular target machine.

Both lexical analysis and syntactic analysis have 
been intensively studied. Johnson et al. (4) describe a
system which allows a lexical analyzer to be automatically 
created from a series of regular expressions describing 
possible input lexical tokens. Similarly, numerous parsing 
schemes (1,2,3) have been developed which allow parsers of 
varying power to be created automatically from a 
context-free BNF description of a language. Very little 
work, however, has been done to similarly formalize and 
automate code generation. The present research represents 
an attempt to isolate some of the problems involved in code 
generation and to show how a code generator can be 
automatically created from a description of the computer 
upon which the code is to be run. 

The research does not attack all the problems that 
such an automatic code generating system would have to 
handle. Rather, it deals with two subproblems corresponding
to two common types of macro, namely: 

1. computational macros, such as ADD, MULTIPLY, OR, etc.;

2. data reference macros, such as subscripting and structure
reference.

In this paper, we examine both types of macro in turn and
develop a model for the logic of such a macro. We then show
how a system can be set up to perform the machine dependent
part of such macro logic from machine descriptive
information.

The two models developed for the operation of the two 
types of macro are different. As a result, the paper can be 
considered to contain two relatively independent topics:
the first dealing with computational macros, and the second 
dealing with data reference macros. 

1.2 PREVIOUS WORK 

Although little work has been done to formalize code 
generation, a great deal of work has been done on the 
related problem of language transferability. One approach 
to this problem is that of the 'mobile programming system'
of Orgass and Waite (5,6). In their system, the source
language is translated into a series of simple macros. Then
a user-written set of macro definitions translates the
macros into machine code. The problem of generating code
for a new machine reduces to the problem of recoding the
macro definitions.

A second approach to language transferability is that 
of the UNCOL macro language (7,8). UNCOL (UNiversal 
Computer Oriented Language) was developed in an attempt to 
create a universal macro language into which all high-level 
languages could be translated and which itself could be
translated into any machine code. If successful, the UNCOL
system would have solved the problem of language 
transferability, since only one translator would ever have 
to be written for a language, and only one code generator 
for a machine. The Orgass and Waite system differs from the 
UNCOL approach in that their macro language was specifically 
tailored to their source language. In practice, the 
restriction imposed by having only one intermediate language
for all source languages and all machines has proven too 
confining for a practical solution. 

The two systems just described are similar in that
both attempted to solve the problem of language
transferability by letting the user specify information
about his machine in procedural form. Most of the
information about machine structure is buried implicitly in
the coding of the macros. Such a procedural approach has
been used in all major published work on code generation.
In contrast, the present work uses information about machine
structure given in explicit, descriptive form.

1.3 BRIEF HISTORY OF CODE GENERATION 

Early languages had very few data types. Fortran, for 
example, had only two data types. Similarly, early machines 
tended to have a small number of special-purpose registers.
For such language-machine pairs, the process of generating
code tended to be straightforward. A macro generally
consisted of a short, independent section of logic which 
performed a few simple tests and then output code. Thus a 
very simple procedural language could let the user define 
these macros (12). 

With the introduction of more complicated machines
and of languages with more data types, some of which (such 
as bit-strings) may be more complicated, code generation has 
become a harder task (9,13). Separate modules have become 
desirable to handle register manipulation and to handle 
data-dependent logic for the various data types. Such a 
modular approach allows a macro to be written fairly 
compactly, calling these modules as subroutines to locate 
free registers and to return usable representations of 
operands (such as a displacement and registers containing a
base and an index). 

In a traditional macro system, all of these modules 
and macros must be written by the user using a procedural 
language provided for the purpose. Due to the complexity of 
modern languages and machines, such a macro language can no 
longer be a very simple one. Similarly, the job of writing 
a code generator is much more complex. 

1.4 DMACS: A DESCRIPTIVE MACRO SYSTEM

This paper describes an automatic code generating 
system named DMACS. There are two steps in creating a 
code-generator using DMACS. The first step is to define a 
set of procedural macros in a machine independent, somewhat 
skeletal form. The second step is to supply information 
describing the computer for which code is to be generated. 
DMACS uses this information to flesh out the macro 
definitions. The two steps are quite independent, so that 
once the first step is done for a language, the second step 
can then be done for a variety of object machines. 
Similarly, once a machine has been described, implementing a 
second language requires little change to the machine 
description.

The first step can be thought of as defining the 
semantics of the language using machine independent
primitives. The second step can be thought of as defining
the structure of the target machine. Examples of the two
steps are discussed in Chapters 3 and 4. To facilitate
these two steps, DMACS provides two languages:

1. MIML - a procedural machine independent macro
language, and

2. OMML - a declarative object machine macro language.

Programs written in the two languages are bound together by
the DMACS system.

Figure 1.2 outlines how the DMACS system is used. As
can be seen, the traditional compile-time vs. run-time
distinction has proliferated into four separate 'times' in
viewing DMACS as a whole.

1. Macro definition time - when a language implementer
presents his machine independent macros to DMACS.

2. Machine description time - when a machine specifier
inputs a description of his machine to fill out the
machine independent macros.

3. Language compilation time - when a programmer inputs
his source program to the compiler as a whole.

4. Program execution time - when that compiled program is
actually executed.

[Figure 1.2 (diagram not reproduced); its first panel is
labeled "1. Macro Definition Time:"]

1.5 OVERVIEW 

The present research develops models of two types of 
macros: computation and data reference macros. At the same 
time, the paper illustrates how these models can be built
into DMACS as tools. These tools can be used by a language 
implementer to create machine independent macros defining 
the semantics of his language which can be filled out from a 
machine description. 

Chapter 2 gives the reader an overall introduction to 
code generation and to the DMACS system. It also discusses 
some of the restrictions as to possible machine structure 
which are assumed in the following chapters. 

Chapter 3 presents a model of the logic of 
computational macros. The model pictures a code generator 
as a state machine whose state is determined by the location 
of the values used in generating code. In the model, each 
computational macro has 'permitted' states for its operands,
from which code can be emitted. For the IBM-360, for
instance, the permitted states for integer addition would 
allow both operands in registers or one operand in a 
register and the other in a word of core memory. To 
generate code for such a macro, the code generator must make 
a transition into a permitted state and then emit an 
appropriate instruction sequence from that state. 

Using a procedural macro system, the user specifies 
how such state transitions are to be made. In a descriptive 
system such as DMACS, the transitions must be performed 
automatically from a description of the register and memory 
structure of a machine, and of the paths (load, store, 
register-register transfers) between core memory and 
registers. 

Chapter 4 turns to the problem of achieving the same
machine independence for data reference macros. To achieve 
this goal, a data definition facility is built into DMACS. 
The language implementer writes his data reference logic in 
terms of the primitives of the facility. A machine 
specifier then describes his machine memory and how source 
data items are mapped into that memory. DMACS can then 
characterize these source data items in terms of the
primitives of the data definition facility. As a result,
the macro logic is able to operate upon them.

In summary, the research is a step towards creating
models of two aspects of the code generation process, and
towards abstracting code generation from any particular
machine. In this paper we show how these models can be
implemented as tools to be used by a language implementer to
create a machine independent code generator which can be
filled out from a machine description. Furthermore, it is
seen that this approach to code generation, as a natural
by-product, leads to a clean separation of the semantics of
a source language from the structure of a particular target
machine, a separation which is often hard to isolate in a
compiler with a code generator oriented towards a particular
machine.
CHAPTER II
A DESCRIPTION OF A CODE GENERATOR 

2.1 INTRODUCTION TO CODE GENERATION 

Code generation is the last major task in the
translation of a high-level language into machine language.
A code generator receives its input from the syntactic
analyzer (the parser). Although in some compilers the input
is in the form of a parse tree, in this paper it is assumed
that the input is in the form of a linear sequence of macro
operations. 

     A = B + C * D;

         =
        / \
       A   +
          / \             1  MUL   C,D
         B   *            2  ADD   1,B
            / \           3  ASSG  A,2
           C   D

     Parse Tree            Macros

This assumption is not a restriction, however, since a parse 
tree can readily be converted into such a sequence of 
macros. The task of the code generator is to convert the 
macros into machine instructions. 
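The conversion from parse tree to macros can be sketched as a post-order walk of the tree. The following Python fragment is only an illustration under stated assumptions: the names `Node`, `MACRO_NAME`, and `linearize` are hypothetical and do not come from DMACS, and a leaf is represented simply by its variable name.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    """Interior parse-tree node: an operator with two subtrees."""
    op: str
    left: "Tree"
    right: "Tree"


Tree = Union[Node, str]  # a leaf is just a variable name

# Hypothetical mapping from source operators to macro names.
MACRO_NAME = {"*": "MUL", "+": "ADD", "=": "ASSG"}


def linearize(tree: Tree, macros: List[str]) -> str:
    """Append macro lines for `tree` in post-order.

    Returns the operand that names the subtree's value: a variable
    name for a leaf, or the line number of the macro computing it.
    """
    if isinstance(tree, str):
        return tree
    left = linearize(tree.left, macros)
    right = linearize(tree.right, macros)
    line = len(macros) + 1
    macros.append(f"{line} {MACRO_NAME[tree.op]} {left},{right}")
    return str(line)


macros: List[str] = []
linearize(Node("=", "A", Node("+", "B", Node("*", "C", "D"))), macros)
```

In this sketch the operands of each macro follow the left-to-right order of the tree, so the second line comes out as `2 ADD B,1` rather than `2 ADD 1,B`; since addition is commutative the two forms describe the same computation.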
In a compiler for a complex language with many data
types, the code generator is often allowed direct access to 
the symbol table constructed by the parser. The information 
in the symbol table can then be used directly to generate 
the correct code to access the different data items. The 
data flow in such a system is illustrated below. 



                           Symbol
                       ->  Table   ->
     Source                               CODE           Machine
     Program -> PARSER                    GENERATOR  ->  Code
                       ->  Macros  ->



The parser converts the source program into macros, while 
simultaneously building the symbol table. The code 
generator then accepts both the macros and the symbol table 
as input for generating machine instructions. 

A macro line consists of a line number, an 
operation, and that operation's operands: e.g., 1 ADD X,Y. In
an actual compiler, the line number is usually implicit, and 
the operation and the operands can be thought of as 
pointers. The operation is a pointer into a table of macro 
definitions. The operands are either pointers to the symbol 
table entries describing the values to be operated upon, or 
pointers to previous macro lines indicating the results of
previous macro operations. 

The paper discusses two particular kinds of macros: 
computational macros and data reference macros. The 
following example illustrates both types of macros. 

     A(I) = B + C(J) * D

     i      SS      C,J
     i+1    MUL     i,D
     i+2    ADD     i+1,B
     i+3    SS      A,I
     i+4    ASSG    i+3,i+2



In this example, SS (subscript) is a data reference macro,
and MUL and ADD are computational macros. 

As an example of computational macro logic, consider
integer addition on the IBM 360. The 360 has two Add
instructions for integers: 'A', which adds a word of memory
to a register, and 'AR', which adds two registers. In
generating code for an ADD macro, the code generator must
check the location of the values to be added to see if
either of the instructions can be emitted directly. If not,
the code generator must emit instructions to load one (or
both) into registers. If, in the process of finding a
register to load into, the code generator must cause the
previous contents of a register to be stored, the new
location of the stored value must be recorded. Furthermore,
if one of the values to be added is not directly accessible
(e.g., a bit string value), the code generator must emit load
and shift instructions to isolate that value in a register.
Finally, after emitting the appropriate add instruction, the
code generator must record the location of the macro's
result.
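The core of this logic, written out by hand in the procedural style, might look like the following Python sketch. It is a simplification under stated assumptions: operand locations are hypothetical `("reg", name)` or `("mem", name)` pairs, a free register is assumed to be available, and the storing of registers and the isolation of bit strings are left out. The mnemonics L, A, and AR are the 360 load and add instructions.

```python
def add_macro(loc1, loc2, free_regs, code):
    """Emit 360-style code for integer ADD; return the result location.

    loc1 and loc2 are ("reg", name) or ("mem", name) pairs.
    free_regs is a list of register names assumed to be free.
    """
    if loc1[0] == "mem" and loc2[0] == "reg":
        # Addition is commutative: put the register operand first.
        loc1, loc2 = loc2, loc1
    elif loc1[0] == "mem" and loc2[0] == "mem":
        # Neither instruction applies directly: load one operand.
        reg = free_regs.pop(0)
        code.append(f"L {reg},{loc1[1]}")
        loc1 = ("reg", reg)

    if loc2[0] == "reg":
        code.append(f"AR {loc1[1]},{loc2[1]}")  # register + register
    else:
        code.append(f"A {loc1[1]},{loc2[1]}")   # register + storage
    return loc1  # the sum is left in the first operand's register


code = []
result = add_macro(("mem", "B"), ("mem", "C"), ["R1"], code)
```

Even this reduced version mixes tests on operand location with machine-specific instruction choices, which is the entanglement that the descriptive approach of the later chapters tries to remove.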

Similar examples of data reference logic are given in
Chapter 4.

2.2 INTERNAL TABLES

The symbol table contains information about all the 
values (variables) declared by the programmer. At some 
point before code generation core locations must be 
allocated for these variables. The core location 
information can be stored in the symbol table entry for each 
item. Exactly how core allocation might be done is 
discussed in Chapter 4. In addition to the values declared
by the programmer, the code generator must also record the 
location of values which have been computed by previous 
macro lines, but not yet used. In most machines, a 
computation leaves its result in some register. Since the
result can often later be used unmoved, it is desirable to
leave it in the register if possible. If, however, an
intervening macro requires that register for its
computations, it is necessary to store its contents in a
'temporary' in core and to remember that this has been
done.

To keep track of the location of such previous macro 
results, three tables are built into the code generator: a
macro result table (MRT), a register state table (RST), and 
a temporary table (TT). 

MRT: The macro result table records the location of a 
macro's result(s) if any. The MRT has one entry for 
each macro line. Each value recorded in the entry
consists of a pointer to the register or temporary where 
the value is located. 

RST: The register state table contains one entry for 
each register. Each entry indicates whether that 
register contains a computed value, or if it is free. 
Each entry recording a computed value contains a pointer 
to the MRT record representing that value. Thus, when a 
register must be stored, the MRT entry can be easily
changed to point to the temporary location where the
value is to be stored. Each RST entry also contains
fields which are used to flag a register with
information to be used in selecting a register to be
stored.

TT: A temporary table can be implemented in various 
ways. For the purpose of this discussion, any 
implementation is acceptable. One strategy is to 
allocate a new temporary each time one is needed, in 
which case all that need be remembered outside the MRT 
is the number of the last temporary allocated. A more 
efficient strategy is to reuse temporaries after the 
results they hold are used, in which case the TT must 
have an entry for each temporary allocated. 
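One possible shape for the three tables is sketched below in Python. The class and method names are hypothetical, each macro is assumed to have a single result, and the temporary table follows the simpler of the two strategies above (only the number of the last temporary is remembered).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Location = Tuple[str, str]  # ("reg", "R1") or ("temp", "T1")


@dataclass
class Tables:
    # MRT: macro line number -> location of that macro's result
    mrt: Dict[int, Location] = field(default_factory=dict)
    # RST: register name -> macro line whose result it holds, or None if free
    rst: Dict[str, Optional[int]] = field(default_factory=dict)
    # TT: number of the last temporary allocated
    tt_last: int = 0

    def record_result(self, line: int, reg: str) -> None:
        """Note that macro `line` left its result in register `reg`."""
        self.mrt[line] = ("reg", reg)
        self.rst[reg] = line

    def spill(self, reg: str) -> str:
        """Move the value held in `reg` to a new temporary.

        The RST entry points back at the MRT entry, so the MRT can be
        redirected to the temporary without any searching.
        """
        line = self.rst[reg]
        self.tt_last += 1
        temp = f"T{self.tt_last}"
        self.mrt[line] = ("temp", temp)
        self.rst[reg] = None
        return temp


tables = Tables(rst={"R1": None, "R2": None})
tables.record_result(1, "R1")
temp = tables.spill("R1")
```

The sketch only updates the bookkeeping; a real code generator would also emit the store instruction at the point where `spill` is called.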

2.3 THE 'GETREG' ROUTINE 

The internal tables described in the preceding 
section allow computed values to be left in the registers 
where they are computed. If such tables are not used, every 
computed value must be immediately stored in a temporary,
which is clearly undesirable. If values are to be left in
registers, however, a routine must be provided which locates
free registers available for use. The paper refers to that
routine as the GETREG routine.

The GETREG routine is passed the name of a register
class as an argument. It cycles through that class looking
for a free register. If none is found, the routine picks
one of the registers and stores its current contents in a
temporary, updating the MRT entry pointing to that value.
The priorities used in selecting which register to store, if
there is a choice, are discussed in Chapter 3.
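A minimal Python sketch of such a routine follows. The function signature and the table representations are assumptions made for illustration, and the choice of which register to store is a placeholder (the first register of the class), not the priority scheme of Chapter 3. ST is the 360 store mnemonic.

```python
def getreg(reg_class, rst, mrt, temps, code):
    """Return a free register from reg_class, storing one if necessary.

    reg_class: list of register names in the class
    rst:       register name -> macro line held, or None if free
    mrt:       macro line -> ("reg", name) or ("temp", name)
    temps:     list of temporaries allocated so far
    code:      list receiving emitted instructions
    """
    for reg in reg_class:
        if rst[reg] is None:
            return reg

    # No free register: store one. Placeholder policy, assumed here
    # only for illustration: always pick the first register.
    victim = reg_class[0]
    temp = f"T{len(temps) + 1}"
    temps.append(temp)
    code.append(f"ST {victim},{temp}")
    mrt[rst[victim]] = ("temp", temp)  # the value now lives in the temporary
    rst[victim] = None
    return victim


rst = {"R1": 1, "R2": None}
mrt = {1: ("reg", "R1")}
temps, code = [], []
first = getreg(["R1", "R2"], rst, mrt, temps, code)   # R2 is free
rst["R2"], mrt[2] = 2, ("reg", "R2")                  # caller records its use
second = getreg(["R1", "R2"], rst, mrt, temps, code)  # forces a store of R1
```

The caller, not the routine, records what the returned register is used for, as the two lines between the calls show.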

2.4 SOME QUESTIONS TO BE ANSWERED

The previous sections give a brief introduction to
code generation in general. The remainder of the chapter
attempts to use the introduction as a framework within which
to outline exactly what aspects of code generation are to be
dealt with in Chapters 3 and 4. Among the questions to be
clarified are these: 

1. What different types of machine structure do the models 
presented deal with? Clearly there are many different types 
of machines, ranging from machines like the 7090 with 
special purpose registers, to machines like the PDP-10 with
general purpose registers, to stack machines, and to
microprogrammed machines capable of complicated runtime
checks. Similarly, machines have differing addressing
mechanisms: byte-addressing, word-addressing, indexed or
unindexed, based or not based, directly addressable or paged
addressable (as in many small machines), etc. The models
presented are not capable of handling all possible machine
structures.

2. What kind of values do the models presented deal with? 
Possible values in a computer are integers of different
precision, booleans, bitstrings, floating point numbers of
different precision, decimal numbers, character strings, 
addresses, etc. The present research is not concerned with 
all of these possible types of values. 

3. How are values allowed to map into the machine 
structure? For instance, are bitstring values to be allowed 
to cross word boundaries? How are different values assumed 
to be accessed? 

4. What is meant by 'machine description'? Intuitively, one
might expect machine description to entail somehow listing 
registers, core memory units and opcodes. On the other 
hand, might not a low-level code sequence, which 
accomplishes some primitive function such as subtraction or 
loading a value, be considered to be a reasonable part of a
'machine description'? This question is discussed in 
Section 2.9. 

2.5 ASSUMPTIONS ABOUT MACHINE STRUCTURE 

The present research makes several simplifying 
assumptions about the structure of possible target 
machines. The assumptions are spelled out in more detail in 
Chapters 3 and 4.

Registers: The machine is assumed to have a set of registers
for manipulating values. These may be either special
purpose or general purpose registers. The machine specifier
describes the registers by naming them, grouping them into
classes, and defining how they are used in manipulating
data. Chapter 3 describes more precisely how this is done.

Core Memory: The whole of core memory is assumed to be
directly addressable (as opposed to the paged addressability
found on some small machines). It is assumed that the
addressing is done in a machine instruction by either a
displacement and an index, or by a displacement, an index,
and a base. The machine specifier must indicate which
registers may be used as indices and bases. In generating
an address, DMACS creates an internal 'generated address'
consisting of a displacement, index, and base (the index or
base may be nil). If both index and base are present in a
generated address, however, and the particular target
machine allows only an index, then DMACS generates code to
add the base and index together, thus transforming the
'generated address' into a 'machine address' for that target
machine.
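The transformation from a generated address to a machine address can be sketched as follows. This Python fragment is an illustration only: the `Address` type, the `allows_base` flag, and the `ADD` instruction text are hypothetical, and treating a lone base register as an index on an index-only machine is an assumption not stated in the text.

```python
from typing import List, NamedTuple, Optional


class Address(NamedTuple):
    disp: int
    index: Optional[str] = None  # register name, or None for nil
    base: Optional[str] = None   # register name, or None for nil


def to_machine_address(addr: Address, allows_base: bool,
                       code: List[str]) -> Address:
    """Turn a generated address into one the target machine accepts."""
    if addr.base is None or allows_base:
        return addr  # already a legal machine address
    if addr.index is None:
        # Assumption: a lone base can simply serve as the index.
        return Address(addr.disp, index=addr.base)
    # Both present on an index-only machine: add the base into the index.
    # "ADD" is a made-up register-register add. This overwrites the index
    # register, so a fuller version might first obtain a free register.
    code.append(f"ADD {addr.index},{addr.base}")
    return Address(addr.disp, index=addr.index)


code: List[str] = []
machine_addr = to_machine_address(Address(8, "R2", "R3"), False, code)
```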

2.6 ASSUMPTIONS ABOUT VALUES

In a complex real-world compiler, many types of 
values can be used as operands. Elson and Rake (9) discuss
some of the involved problems of writing macro definitions 
for a complicated language (PL/I). The present work does 
not attempt to handle the complexity of such a language; 
rather, it makes certain simplifying assumptions as to the 
types of values to be allowed as operands. The restrictions 
allow a reasonably simple model of code generation to be 
constructed which exposes some of the basic conceptual 
processes and problems involved, without becoming bogged 
down in a huge ad-hoc mess. 






The model of a code generator presented in this paper
is set up to handle values which, intuitively, are of the
integer (or integer bit-string) and floating point variety;
values which are manipulated via registers and thus are no 
larger than the registers used on the particular target 
machine. Character-string and decimal values are not 
considered. 

2.7 HOW VALUES ARE REPRESENTED ON THE MACHINE 

There are three general classes of locations for 
values on a machine: a value can be in a register, it can
be simply accessible in core, or it can be in core but not
simply accessible. A value is simply accessible if its
address can be put directly into a computational machine
instruction, such as an Add instruction. (Thus a value may be
addressable in a special load instruction yet not simply
accessible.) For instance, a byte on the IBM-360, even
though addressable, is not simply accessible for
computation. It must first be isolated in a register.

Let us examine how a value might fall into each of
these classes. 
Registers: The only values which may be in a register (at
the start of a macro expansion) are values computed by
previous macro lines.

Simply Accessible: Simply accessible values include both
results of previous macro lines which have been stored in
temporaries (which are assumed to be simply accessible
locations), and values declared in the source program which
have been mapped into simply accessible core memory units.
Chapter 4 explains exactly how this mapping is done.

Not Simply Accessible: This class is composed of values
which cannot be directly operated upon by computational
instructions. They must first be isolated in a register
before they can be used. Such values include individual
bits, and bit-strings which are not on wholly accessible
boundaries.

2.8 LOAD/UPDATE ROUTINES 

The fact that not all values are simply accessible 
gives rise to the concept of a load/update pair: a pair of 
routines to access and to update a value. The idea of 
characterizing a data item by a pair of load/update routines
was first formulated by Strachey (11). A simple example of
such an inaccessible data item is a bit-string within a
word. Its location might be represented by an address
(perhaps indexed and based) and a bit displacement within
the addressed memory unit. Its load/update pair might
consist of two routines which take the 'location' and
generate code as follows:

1. Load Routine: 

a. load the memory unit (i.e., word) into a register

b. shift left to eliminate high-order bits 

c. shift right eliminating low order bits and 
right-adjusting the value in the register 

2. Update Routine: 

a. shift the new value to the correct target position 

b. load the target word into a register

c. use a bit mask to zero out the target byte 

d. OR the two words together 

e. store the result 
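The effect of the code these two routines generate can be checked with a little arithmetic. The Python sketch below does not emit instructions; it simulates, step by step, what the emitted shifts and masks would do to a word. It assumes a 32-bit word, bit positions counted from the high-order end, and a new value that fits in the field; the function names are hypothetical.

```python
WORD = 32
MASK = (1 << WORD) - 1  # keeps results within one 32-bit word


def load_field(word: int, start: int, length: int) -> int:
    """Simulate the load routine for a field of `length` bits at `start`."""
    reg = word & MASK            # a. load the word into a register
    reg = (reg << start) & MASK  # b. shift left, dropping high-order bits
    reg >>= WORD - length        # c. shift right, right-adjusting the value
    return reg


def update_field(word: int, start: int, length: int, value: int) -> int:
    """Simulate the update routine; returns the word to be stored."""
    shift = WORD - start - length
    new = (value << shift) & MASK              # a. position the new value
    reg = word & MASK                          # b. load the target word
    field_mask = ((1 << length) - 1) << shift
    reg &= ~field_mask & MASK                  # c. zero out the target field
    reg |= new                                 # d. OR the two words together
    return reg                                 # e. the result to be stored


w = 0xABCD1234
old = load_field(w, 8, 8)            # the second byte of the word
w2 = update_field(w, 8, 8, 0xFF)     # replace that byte with all ones
```

Here the field location is known when the routines run, which corresponds to the compile-time case; for a location computed at run time the shift counts would themselves have to be computed by emitted code.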

In practice such a value has two kinds of 'location' 
and correspondingly two load/update pairs: one for when the 
location of the string within the word is known at compile 
time, and one for when it is computed at run time. The
routines are further complicated if a value extends across a
word boundary.

The load/update problem arises from the fact that 
programmers are interested in values that do not map 
directly into accessible units. Generally only an address 
can be put into a machine instruction. If a computational 
machine instruction could accept an address, starting bit, 
and bit length, then the complexity of the load/update 
routines would disappear. An alternate approach might be to 
have special hardware load and store instructions to access 
bits of a word. This would retain the load/update 
framework, but the routines would consist of only one 
instruction. 

2.9 MACHINE DESCRIPTION 

Using DMACS, a machine specifier can implement a 
language by describing various features of his machine. In 
the next two chapters, the details of such a description are 
examined in more detail. 

Parts of the 'description' consist of listing names 
of registers and of core memory units and of describing how 
these relate to one another. Another part of this 
description, however, involves writing short low-level code 
CHAPTER III
A CODE GENERATOR AS A STATE MACHINE

3.1 OVERVIEW
3.1.1 THE STATE MACHINE 

Chapter 3 presents a model of the logic of a 
computational macro. This model pictures a code generator 
as a state machine whose state is determined by the location
of the values used to generate code. The location of a
value may be an accessible core location, a non-accessible
core location, or a register. In the model, each 
computational macro has one or more permitted states for its 
operands from which code can be emitted. To generate code 
for a macro, the code generator must make a transition into 
one of the permitted states and emit a particular code 
sequence from that state. 

In a procedural macro definition language, the user 
explicitly specifies these transitions himself. In a 
descriptive system such as DMACS, logic to perform 
transitions is deduced automatically from machine-descriptive
information. The chapter shows how such an
automatic mechanism is built into DMACS to perform
transitions given a machine description describing register
structure, permitted states for computation, and code
sequences which perform these computations. Not
surprisingly, the automatic mechanism makes certain
restricting assumptions as to object machine structure.
Thus, the model is a somewhat restricted one presented to 
isolate the basic ideas involved, and to provide a basis 
upon which a more general system can be built. 

3.1.2 THE STATE OF THE MACHINE 

In this chapter, the term 'state' is used in two 
contexts: the 'state' of the code generator as a whole, and 
an input, output, or permitted 'state' of an individual
macro. 

1. The state of the code generator is determined by the 
locations of all the values which are to be used as 
operands to any macro. 

2. The input state of a macro is determined by the
location of the values passed to it as operands.

3. A permitted state of a macro is a particular
configuration of operand locations from which code can
be emitted.

4. An output state of a macro is determined by the
location of the result of the computation. 
3.2 A SIMPLE EXAMPLE

The following simplified example illustrates how the
state machine concept is used. The example concentrates on
integer addition for the IBM-360.

1. Input States: For simplicity, let us restrict operands to
two locations:

1. registers of class 'R' (abbreviated 'R')

2. accessible storage (abbreviated 's') 

Thus input states for two operands can be described by the 
following pairs (s,s), (s,R), (R,s), or (R,R). 

2. Permitted States: The IBM-360 has two instructions which
perform integer addition. Permitted states are (R,s),
(s,R), and (R,R). From (R,s) and (s,R) a
storage-to-register Add instruction, 'A', is emitted. From
(R,R) a register-register Add instruction, 'AR', is
emitted.

3. A Machine Independent Macro : If the source language 
allowed both integer and floating point operands, the 
language implementer might write a machine independent ADD 
macro with logic as follows: 
are discussed in Chapter 4. A more detailed description of
the 360's register structure is found in Section 3.7.

Next, the machine specifier defines integer addition.

     IADD a1,a2 (commutative)
        from R(a1),R(a2) emit AR a1,a2 result R(a1)
        from R(a1),S(a2) emit A a1,a2 result R(a1)

This description defines two permitted states, code to be
emitted from each state, and the location of the macro
result. In the first state, both operands are in registers.
From this state, an 'AR' instruction is to be emitted. The
result is to be recorded in the register containing the
first operand. The declarations are used to fill out the
MIML macro. The attribute 'commutative' indicates that
addition is commutative, and thus R(a2),S(a1) will be
included as a permitted state without being declared
explicitly.

Notice that the declarations are essentially a 
description of IBM-360 integer addition. 
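
The way the 'commutative' attribute fills in the undeclared state can be sketched in Python. This is only an illustration under assumptions: the State record, the expand_commutative and emit helpers, and the use of {a1}/{a2} placeholders are hypothetical names chosen here, not part of OMML or DMACS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """One permitted state (hypothetical record, not an OMML construct)."""
    locs: tuple    # operand locations, e.g. ("R", "S")
    emit: str      # instruction template using {a1} and {a2}
    result: str    # operand whose register receives the result


def expand_commutative(states):
    """Add the mirrored state for each declared state of a commutative macro."""
    table = {s.locs: s for s in states}
    swap = {"a1": "a2", "a2": "a1"}
    for s in states:
        mirrored = (s.locs[1], s.locs[0])
        if mirrored not in table:
            # Exchange the operand roles so the register operand still
            # appears first in the emitted instruction.
            table[mirrored] = State(
                locs=mirrored,
                emit=s.emit.format(a1="{a2}", a2="{a1}"),
                result=swap[s.result],
            )
    return table


def emit(table, locs, a1, a2):
    """Fill in the code template of the permitted state `locs`."""
    return table[locs].emit.format(a1=a1, a2=a2)


# The two declared IADD states from the example above.
IADD = expand_commutative([
    State(("R", "R"), "AR {a1},{a2}", "a1"),
    State(("R", "S"), "A {a1},{a2}", "a1"),
])
```

With this table, a call such as emit(IADD, ("S", "R"), "X", "r3") uses the derived state and places the register operand first in the emitted 'A' instruction.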

5. Advantages: Because the state machine model is built
into DMACS, both the language implementer and the machine
specifier find their tasks lightened. The language
implementer can write a very simple source macro without
worrying about machine structure. He need not perform tests
to ascertain the state of the operands, nor transform the
input operand state in any way. The machine specifier, in
turn, is able to implement the macros by describing his
machine without worrying about the constructs of the source
language or the internals of the compiler.

6. The Role Of DMACS: The machine specifier defines his
register structure, the permitted states filling out each
'primitive' (such as IADD) in the machine independent
macros, the code sequences to be emitted from each permitted
state, and his data pathways including load and store
instructions. From this descriptive information, DMACS must
deduce three things: how to select a target permitted state
for a given input state, how to reach that state, and how to
obtain a free register of a given class when, in the process
of making a transition, it needs to load a value.

The remainder of this chapter deals with these topics 
in more detail and discusses the problems involved.

3.3 MACHINE STRUCTURE
3.3.1 REGISTERS 

The code generator must be able to manipulate values 
in and out of registers to attain permitted states. In 
trying to incorporate automatic register handling logic into 
DMACS, there are two conflicting goals. First, the user must 
be able to describe his registers flexibly enough to include 
a reasonably large class of machines. Second, there must be 
enough restrictions so that the logic which attains 
permitted states can be generated from this description 
automatically. These two goals conflict since the more 
flexible the model is, the harder it is to incorporate into 
an automatic system. The assumptions as to register 
structure outlined in this section are restrictive, but 
provide a base for later extension of the model. 

In attaining permitted states the system must be able 
to find a free register of a given class, to load and store 
the contents of any register, and to transfer a value from 
one register to another. To allow this, the machine 
specifier defines the following: 

1. The Machine Registers: R = (r1, r2, r3, ..., rn)

2. Classes of Registers: (R1, R2, ..., Rn); each Ri is a
subset of R.
The classes are defined so that every register is in at
least one class, if only by itself, and so that any two
classes are nested (one a subset of the other), equal,
or disjoint. There is no partial overlap.

3. Pathways to Core: Each class of registers is assumed
to have a direct path to and from core. There is no
need to go through a second register in either loading
or storing. This is a simplifying assumption which
might be relaxed in a more powerful extension of the
model. (A stack machine, for instance, does not conform
to this assumption.) The machine specifier must define
the load and store instructions used in these pathways.

4. Paths between Registers: The machine specifier must
define any available register to register transfers.

5. Relationships between Registers: The machine
specifier may define relationships between registers.
These can be used for such register-register
relationships as even-odd pairs. He may also specify
that in certain conditions the use of one register
implies that a related register must be made available
as well.

In this fashion, the user describes his register
structure. Section 3.3.3 describes how this information is
used by DMACS to construct the GETREG routine to obtain a
free register of a given class.
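
The two restrictions on register classes (every register belongs to some class, and no two classes partially overlap) are easy to check mechanically. The following Python sketch shows one way such a check might look; the function name check_register_classes and the dictionary representation of classes are assumptions made for illustration only.

```python
def check_register_classes(registers, classes):
    """Return a list of violations of the register-class assumptions.

    registers: set of register names
    classes:   dict mapping a class name to its set of register names
    """
    problems = []

    # Every register must be in at least one class.
    covered = set()
    for members in classes.values():
        covered |= members
    for r in sorted(registers - covered):
        problems.append(f"register {r} is in no class")

    # Any two classes must be nested, equal, or disjoint.
    names = sorted(classes)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            ra, rb = classes[a], classes[b]
            nested = ra <= rb or rb <= ra
            if not nested and ra & rb:
                problems.append(f"classes {a} and {b} partially overlap")
    return problems
```

A description that passes this check returns an empty list; a class pair that shares some but not all registers is reported as a partial overlap.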

3.3.2 SAMPLE REGISTER DESCRIPTION: IBM-360

rclass REG: r2,r3,r4,r5,r6,r7,r8,r9,r10,r11

rclass ODDREG: r3,r5,r7,r9,r11

relation EPAIR (stored:ODDREG)
    r3:r2
    r5:r4
    r7:r6
    r9:r8
    r11:r10

rpath WORD->REG: L REG,WORD
rpath REG->WORD: ST REG,WORD
rpath REG->ODDREG: LR ODDREG,REG

These declarations define two register classes. For
each member of the class ODDREG, a related EPAIR register is
declared. The attribute (stored:ODDREG) means that when an
ODDREG register is called for, its related EPAIR register
must be made available as well.






3.3.3 THE 'GETREG' ROUTINE

The GETREG routine is called by DMACS when, in
performing a transition, a free register of a given class is
needed. The routine must be adjusted by DMACS using the
machine specifier's description of his register structure,
so that it operates correctly for the particular machine
registers and register classes involved.

The GETREG routine cycles through the register class
it receives as an argument attempting to find an empty
register. If none is empty, the routine must choose a
register to store based on the 'flags' attached to the
registers. The flags are used to protect values in
registers so that they will not be stored unless necessary.
The mechanism for flagging is complicated by the fact that a
macro can be called as a subroutine during a given macro
expansion. The priorities used in protecting registers are
discussed in more detail in Section 3.5. The net effect of
the priorities is that the most recently set registers are
stored last. The GETREG routine also must handle situations
when related registers must be freed at the same time.

3.4 THE AUTOMATIC TRANSITION
3.4.1 INTRODUCTION TO THE TRANSITION

Performing a transition involves transforming any 
possible input state of a macro into one of that macro's 
permitted states. The input state of a macro is determined 
by the location of the values passed to it as operands. 
These operands can be classed as follows: 

1. s - in an accessible storage location in core memory

2. Ri - in a register of register class Ri

3. Rif - in a non-accessible core location, requiring a
code generating load function which will isolate the
value in a register of class Ri. (The concept of having
to apply a function to an operand could apply to mode
conversions as well as to loading non-accessible
values. Thus, although this paper deals only with Rif
values in a limited context, the concept involved is a
more general one.)

For the sake of simplicity, this section deals only
with two-operand macros (such as ADD X,Y). For a
two-operand macro, input states are taken from

(s U Ri U Rif) X (s U Ri U Rif)

Permitted states are taken from

(s U Ri) X (s U Ri)

On a machine that has one class of registers (say R),
input states are (s,s), (s,R), (s,Rf), etc. Permitted
states for an arbitrary macro might be (s,R) and (R,R).
Thus a reasonable transition to make from input state (R,s)
is to permitted state (R,R).

Performing the transition involves choosing a target
permitted state to aim for and a path to reach that state.
In the remainder of the chapter, we assume that the task of
performing a transition can be seen as two distinct
problems:

1. Selection of a target permitted state for each input
state, based on the cost of transforming each operand
location.

2. Given an input state and its target state,
determining in what sequence changes are to be made.

These two steps are closely related. On some machines, the
two steps cannot be performed independently. For instance,
on a stack machine one cannot consider the cost of
transforming the location of each operand individually
without considering the sequence of transformations. On a
machine that conforms to the assumptions that we have made
about register structure, however, the separation of these
two steps is possible.

3.4.2 SELECTING A TARGET STATE

The selection of a target state for each possible
input state of a macro is the first step in setting up the
automatic transition. If there is only one permitted state,
this selection is trivial. Otherwise a target for each
input state must be selected from among several permitted
states. Clearly, some criterion is needed for measuring the
cost of changing states. For each input state, the permitted
state yielding the lowest such cost can then be selected as
a target. The cost criterion used in this section is the
number of instructions required (not counting inadvertent
storing of values, since these are not predictable in
advance). Since we assumed in section 2.3 that each
register has a direct path to and from core, the maximum
cost of changing the location of one operand is 2 (storing
the value from one register, and then loading it into a
second). Rif values, which require a function to load them
into a register, are treated as if they were already in that
register since the function is to be applied in all cases
and is thus a constant cost.

Example: Assuming two register classes, R and R', and a
register-register transfer, sample costs are:

(input state)  (permitted state)  (cost)  (comment)

(s,s)          (s,R)              1       load
(s,R)          (s,R)              0
(s,R)          (R,s)              2       store, load
(s,R)          (s,R')             1       transfer
(s,Rf)         (R,R)              1       load
(Rf,Rf)        (R,R)              0

Figure 3.1: Sample Transition Costs

An alternate cost criterion might be instruction
execution times. In either case, the target selection
algorithm simply selects for each input state that permitted
state which can be reached at lowest cost. The selection of
a target state for each input state need not be performed
every time the macro is called. It can be compiled into a
table at machine description time.
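
The cost criterion and the selection of a lowest-cost target can be sketched in Python as follows. This is a minimal illustration under stated assumptions: locations are encoded as strings ('s', a class name such as 'R', or a class name followed by 'f'), and the names cost, select_target, and the transfers parameter are hypothetical, introduced here only for the example.

```python
import math


def cost(i, p, transfers=frozenset()):
    """Instructions needed to move one operand from location i to p.

    transfers is a set of (from_class, to_class) pairs for which a
    register-register transfer instruction has been declared.
    """
    if i.endswith("f"):
        i = i[:-1]      # an Rf value is costed as if already in R
    if i == p:
        return 0
    if i == "s" or p == "s":
        return 1        # a single load or a single store
    # Two different register classes: transfer, or store then load.
    return 1 if (i, p) in transfers else 2


def select_target(input_state, permitted, transfers=frozenset()):
    """Return (index, cost) of the cheapest permitted state."""
    target, mincost = None, math.inf
    for j, p in enumerate(permitted):
        total = sum(cost(i, q, transfers) for i, q in zip(input_state, p))
        if total < mincost:
            target, mincost = j, total
    return target, mincost
```

Running select_target over every possible input state once, at machine description time, would yield the table mentioned above. Ties are resolved here in favor of the first permitted state declared, which is an arbitrary choice.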

Algorithm to select a target permitted state
for an input state of a macro

Input state: i = (i1, i2)
Permitted states: pj = (pj1, pj2), j = 1,n

    1. Set j = 1, target = nil, mincost = oo.
    2. Compute totalcost = cost(i1,pj1) + cost(i2,pj2).
    3. If totalcost is less than mincost, set target = j and
       mincost = totalcost.
    4. Advance j; if permitted states remain, return to step 2.

The function cost(i,p) determines the number of instructions
required to change i to p.

3.4.3 SEQUENCING THE TRANSITION

Once a target state has been selected for each input
state, there remains the problem of deciding the order in
which changes are to be made. This chapter first outlines a
general strategy which accomplishes sequencing for all
possible machine structures. The general strategy is called
'blind' sequencing for reasons that become apparent. The
chapter later discusses a second strategy called
'predictive' sequencing which, when possible, might be more
efficient. The following discussion concentrates on 'blind'
sequencing, since blind sequencing always works and is a
good vehicle for outlining the problem involved.

To illustrate the problem, let us consider the
transition from input state (s,s) to target state (R,R).
There are two possible paths, as this graph indicates:

[Graph: (s,s) --l1--> (R,s) --l2--> (R,R);
        (s,s) --l2--> (s,R) --l1--> (R,R)]

In the graph, each node represents a state, square nodes
represent permitted states, and paths from one state to
another are represented by arcs. The arcs are labeled to
indicate what operation is being performed to which operand.
Thus l1 indicates that the first operand is being loaded
when that arc is followed.

In the graph, there are two paths from (s,s) to
(R,R). In this particular example, the paths are equally
efficient and either can be selected. An important thing to
notice in this example is that every arc connects one
possible input state to another. This is true because we
have taken all possible combinations of locations as
possible input states. Since each arc must lead to another
possible input state, each state can be examined in turn and
one labeled arc can be drawn from its node. Drawing this
one arc for each state completes the graph. A decision
procedure for determining which arc to draw for a given
state is given in Figure 3.3.

Algorithm For Blind Sequencing

Problem: Given an input state and a target state, determine
what operand to transform first. (Making this decision for
each possible input state completes the graph.)

Each operand can be expressed as one of the following:

    Input:   s, R, Rf
    Target:  s, R, R' (R' ≠ R)

The first operation to be performed to a given operand can
be expressed as follows:

                   Target
    Input      s      R      R'
    s          nil    l      l
    R          st     nil    t
    Rf         f1     f2     f3

    l  = 'load'
    st = 'store'
    t  = 'transfer'
    f? = 'apply function f'

These operations can be given priorities:

    st > f1 > f3 > f2 > t > l

The effect of the priorities is to give highest precedence
to storing values, next highest to applying functions, then
to register transfers, and finally to simple loads.

Sequencing is done by labelling each operand of each input
state by st, f1, f2, f3, t, l; and then drawing the arc
corresponding to the operation with the highest priority.
If both have equal priority, then either can be picked.

Figure 3.3

3.4.4 ACCIDENTAL TRANSITIONS

There is one complication to be considered in
performing the sequencing. It is illustrated below:

Figure 3.2: Sample Sequencing Graph

In input state (R,Rf), the first operand is already in a
register. A code generating function is to be applied to
load the second operand. The function can generate code
using registers, and even might call macros as subroutines
to perform runtime computation. For instance, to load a bit
value extending across a word boundary it is necessary on
the 360 to load a register pair and perform double word
shifts to isolate the value. Thus, the function may require
the register containing the first operand to be stored. If
this happens, a transition to (s,R) occurs, rather than to
(R,R). This is called an accidental transition from an
unstable state to an alternate state. To accommodate such
unexpected but unavoidable transitions, the graph is
augmented to include dotted arcs from such unstable states
to the appropriate alternate state.




Figure 3.4: An 'Unstable' State

The graph in Figure 3.4 indicates that applying f2 to the
input state (R,Rf) should lead to (R,R) but might lead to
(s,R). Full examples of such graphs are given in Figures
3.5 and 3.6.

input states: (s U R U Rf) X (s U R U Rf)
permitted states: (R,s), (R,R)

Figure 3.5

input states: (s U R U R' U Rf) X (s U R U R' U Rf)
permitted states: (R',s), (R',R)

Figure 3.6

Blind Sequencing Graphs

Let us consider for a moment how such accidental
transitions can be implemented in DMACS. In the performance
of the transition (R,Rf)->(R,R), the register containing the
first operand should not be stored unless necessary. This
can be assured by flagging that register's entry in the RST
(register state table). When storing any register, the
GETREG routine must attempt to respect these register
flags. After the function has been completed, the code
generator must check to see if the first operand is still in
a register. If not, it must follow the dotted arc to reach
its next state.
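
The first-operation table and priorities of Figure 3.3, together with the check for an accidental transition, can be sketched in Python as follows. This is a minimal illustration, not the DMACS implementation: the names FIRST_OP, PRIORITY, first_arc, and after_function are hypothetical, and ties are broken in favor of the first operand, a choice the text leaves arbitrary.

```python
# First operation for one operand, indexed by (input, target).
# "R'" stands for a register class different from the input's class.
FIRST_OP = {
    ("s", "s"): None,  ("s", "R"): "l",   ("s", "R'"): "l",
    ("R", "s"): "st",  ("R", "R"): None,  ("R", "R'"): "t",
    ("Rf", "s"): "f1", ("Rf", "R"): "f2", ("Rf", "R'"): "f3",
}

# Highest priority first: st > f1 > f3 > f2 > t > l.
PRIORITY = ["st", "f1", "f3", "f2", "t", "l"]


def first_arc(input_state, target_state):
    """Return (operand number, operation) for the arc to draw, or None."""
    candidates = []
    for n, (i, t) in enumerate(zip(input_state, target_state), start=1):
        op = FIRST_OP[(i, t)]
        if op is not None:
            candidates.append((PRIORITY.index(op), n, op))
    if not candidates:
        return None     # the input state already is the target state
    _, n, op = min(candidates)
    return n, op


def after_function(expected_state, still_in_register):
    """State actually reached after a load function has run.

    Any register operand that the function caused to be stored is now
    in storage, which corresponds to following a dotted arc.
    """
    return tuple(
        loc if loc == "s" or kept else "s"
        for loc, kept in zip(expected_state, still_in_register)
    )
```

For input state (R,Rf) with target (R,R), first_arc selects f2 on the second operand; if the first operand's register was stored while the function ran, after_function reports (s,R) instead of (R,R).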

3.4.5 A REVIEW OF SEQUENCING 



The sequencing strategy described in the previous
sections is called 'blind' sequencing because it involves
applying load functions without looking ahead to see how
these functions behave. Consider the following situation:

Figure 3.7: Sample Sequencing Graph

'Blind' sequencing arbitrarily selects one of these paths.
If it is possible to determine in advance whether a given
function can be applied without disturbing a value already
in a register, then a more effective strategy can be used.
The second strategy is called 'predictive' sequencing.
Predictive sequencing is not so general as blind sequencing
since for an arbitrary machine with register-register
relations like even-odd pairs, the exact needs of an
arbitrary function (even if quite simple) may be very
difficult to predict and control.

This chapter concentrates on the blind sequencing
approach. Predictive sequencing is mentioned primarily to
put the problem in perspective. The main argument against
developing a general predictive sequencing strategy for an
arbitrary machine is that it would be very difficult to
design, and would run the risk of using more instructions at
compile time than were ever saved at runtime. In code
generation, it is generally true that if elaborate
optimization is to be done, it is most profitably done on a
fairly global basis, such as allocating registers over
loops, removing invariance from loops, consolidating common
subexpressions, etc. Blind sequencing has the advantage of
affording a degree of local optimization (compared to a
system which stores all registers before calling a function)
without any really elaborate machinery.

This concludes the introduction to sequencing. Let us
step back for a moment and evaluate briefly what these blind
sequencing graphs imply. A blind sequencing graph is
compiled at machine description time by DMACS. Any
particular graph applies to a particular machine, but the
concept of such a graph is a general formalism and is
applied to all machines.

Figure 3.8: Sample Sequencing Graph

To understand the significance of this fact, let us consider
what factors govern how frequently the accidental transition
is followed from (R,Rf) in Figure 3.8. The frequency
depends both on the source program and on the target
machine. If the source program uses data-types which
require simple accessing functions, then the dotted arc will
tend to be followed less often than if more complex
data-types are used. Similarly, if the machine has many
available registers, the dotted arc will tend to be followed
less often than if the machine has few of them.

Notice, however, that neither the programmer nor the
machine specifier need even know that the problem exists.
For that matter, neither need the language implementer.
Only one person need ever worry about it: the DMACS designer,
who does all the worrying for everyone.

3.4.6 A GENERAL OVERVIEW OF A TRANSITION

The automatic transition mechanism looks at the type
of a macro's operands, looks at the permitted states, and
then initiates one or more transformations to attain one.
This process can be described in general terms:

1. A system is in a given state (certain values are in
certain locations).

2. It is desired to transform the system to a new state
with certain properties (particular values in particular
locations).

3. Functions are available which can effect a desired
local change to part of the system, but with possibly
unpredictable side-effects. (A load function applied to
an Rif value might store an arbitrary value from a
register.)

4. It is desired to make a sequence of such local
changes and still have the resulting global state
well-defined.

To accomplish this goal, the mechanism that generates
permitted states must be able to detect when one function it
applies stores a register that it expected to be loaded, and
either reload that value or else pick an alternate permitted
state. Two potential problems in a system of this sort are
deadlock and thrashing.

1. Deadlock: Deadlock occurs if a value is irrevocably
locked into a register by the flagging mechanism, so that it
cannot be stored. If this were possible, it is easy to
imagine a situation where a macro called as a subroutine
might be unable to obtain the registers it required. Such a
deadlock cannot occur in the system outlined here, since
the register flags are only interpreted as requests that a
register not be stored unless necessary.

2. Thrashing: In section 3.4.4, accidental transitions were
described. In such transitions, an attempt to reach one
state results in an inadvertent transition to an alternate
state. One might wonder whether such inadvertent
transitions could be repeated indefinitely. If so, then a
thrashing situation might result, in which each successive
attempt to reach a permitted state is thwarted. Fortunately,

of a load function may result in macros being called as
subroutines, each macro invocation is numbered
sequentially. That number is used to flag registers. In
obtaining a register, the GETREG routine uses the following
priorities:

1. an empty register

2. an unflagged register

3. a register with the lowest flag (i.e., least recently
set)

Thus the most recently computed values are the most
securely protected. As a result, if a register is
loaded and no arbitrary functions are called it can be
relied upon to remain in its register.
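
The three priorities can be sketched as a small Python routine. This is an illustration only: the layout of the register state table (a dictionary whose entries are either None or a record with 'value' and 'flag' fields) is an assumption made here, and the handling of related registers such as even-odd pairs is deliberately left out.

```python
def getreg(reg_class, rst):
    """Choose a register of reg_class using the three priorities.

    reg_class: list of register names in the class
    rst: dict mapping a register name to None (empty) or to a record
         {"value": ..., "flag": int or None}; this layout is hypothetical
    Returns (register, value that must be stored first, or None).
    """
    # 1. an empty register
    for r in reg_class:
        if rst.get(r) is None:
            return r, None

    # 2. an unflagged register
    for r in reg_class:
        if rst[r]["flag"] is None:
            return r, rst[r]["value"]

    # 3. the register with the lowest flag (least recently set)
    r = min(reg_class, key=lambda name: rst[name]["flag"])
    return r, rst[r]["value"]
```

Because a flag never prevents a register from being chosen in the third step, the flags act only as requests, which is the property that rules out deadlock.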

The process of macro expansion involves performing
the following steps, I, II, and III in sequence:

I. Protect Values Already in Registers: First any values to
be used by the macro which are already in the correct
registers are flagged. Such values include operand values
as well as values to be used as indices or bases to obtain
an operand value. If a value in a register requires that a
related register be stored, then make sure that register is
stored and flag it as well.

II. Perform the Transition to a Permitted State: Notice
that it is in the process of following the sequencing arcs
that the load functions and the GETREG routine are called as
subroutines. Load functions are called when a load is
applied to an Rif value. The GETREG routine is called when
a load or transfer arc is traversed.

III. Perform Emission and Bookkeeping:

1. For each operand in storage, load any index or base
values which are not already loaded.

2. Erase all RST flags set by this macro.

3. Emit the code sequence associated with the permitted
node attained using the sequencing graph. The code sequence
was specified by the machine specifier when defining
permitted states.

4. Erase the operands from the MRT and RST.

5. Record the macro result, if any, in the RST and MRT.

Figure 3.9: Performing a Transition

3.6 AN EXTENSION: OPERATIONS TO MEMORY 

A useful extension to the state machine concept, as
outlined, is to incorporate 'operation-to-memory'
instructions, such as 'add-to-storage'. It is simple to
include this common class of instructions by allowing the
user to specify alternate destinations for a macro result.

Example: For the PDP-10, which has such instructions, an
OMML definition for IADD (defined in section 2.1.2) can be:

IADD a1,a2

from REG(a1),REG(a2) emit IADD a1,a2 result REG(a1)

from REG(a1),WORD(a2) emit IADD a1,a2 result REG(a1)
                   or emit IADDM a1,a2 result WORD(a2)

(The IADD and IADDM instructions being emitted are PDP-10
opcodes.) The second state declaration indicates that if a1
is in a register, and a2 is in core, then an IADD
instruction yields a result in the register, and an IADDM
instruction yields a result in core.

To take advantage of such information only four
modifications need be made to the logic outlined in this
chapter:

1. When emitting code: If such a choice exists and the
core location is a temporary, then defer emitting the
instruction, and flag the register in the RST,
indicating the two instructions and the core location.

2. In the GETREG logic: emit an operation-to-memory
instruction in preference to explicitly storing a
value.

3. In selecting a permitted target state: If there is a
choice of input states due to deferral of such an
instruction, then evaluate both possible input states
and select that one whose target has least cost. If the
selection requires emission of an operation-to-register
instruction, continue to defer its emission until it is
clear that the value need not be stored.

4. After sequencing and prior to code emission: First
emit any necessary op-to-register instructions for input
operands which have been deferred.

Although these simple modifications to the DMACS logic
certainly lead to no dramatic gains in efficiency, they do
represent a useful extension to the state machine concept.

3.7 SAMPLE MACHINE DESCRIPTIONS

This section outlines the logic of two simple
machine independent macros which might be written in MIML.
Then OMML descriptions of the IBM-360 and of the PDP-10
which fill out the macros are given.



Machine Independent Macro Logic:

macro MUL X,Y
    if the types of X and Y are integer
        then IMUL X,Y
    else if the types of X and Y are floating
        then FMUL X,Y
    else error

macro SUB X,Y
    if the types of X and Y are integer
        then ISUB X,Y
    else if the types of X and Y are floating
        then FSUB X,Y
    else error



OMML Machine Description of the IBM-360:

The IBM-360 has one set of registers for integer
arithmetic and another set for floating point arithmetic,
and therefore has separate pathways to and from these
registers. For multiplication and division of integer
operands, even-odd pairs of registers are used.

rclass REG: r2,r3,r4,r5,r6,r7,r8,r9,r10,r11
rclass ODDREG: r3,r5,r7,r9,r11
rclass FREG: fr0,fr2,fr4,fr6
relation EPAIR (stored:ODDREG)
    r3:r2, r5:r4, r7:r6, r9:r8, r11:r10

rpath WORD->REG: L REG,WORD
rpath REG->WORD: ST REG,WORD
rpath WORD->FREG: LE FREG,WORD
rpath FREG->WORD: STE FREG,WORD

IMUL m1,m2 (commutative)
from ODDREG(m1),REG(m2) emit MR EPAIR(m1),m2 result ODDREG(m1)
from ODDREG(m1),WORD(m2) emit M EPAIR(m1),m2 result ODDREG(m1)

(On the IBM-360, multiplication requires one operand in an
'odd' register. The multiply instruction must refer to its
even pair.)

ISUB s1,s2
from REG(s1),REG(s2) emit SR s1,s2 result REG(s1)
from REG(s1),WORD(s2) emit S s1,s2 result REG(s1)

FMUL m1,m2 (commutative)
from FREG(m1),FREG(m2) emit MER m1,m2 result FREG(m1)
from FREG(m1),WORD(m2) emit ME m1,m2 result FREG(m1)

FSUB s1,s2
from FREG(s1),FREG(s2) emit SER s1,s2 result FREG(s1)
from FREG(s1),WORD(s2) emit SE s1,s2 result FREG(s1)
from FREG(s2),WORD(s1) emit LCER s2,s2;AE s2,s1 result FREG(s2)

(Notice that since a 'complement register' instruction,
LCER, exists for floating point, a state can be specified
with s2 in a register and s1 in core.)



OMML Machine Description of the PDP-10:

The PDP-10 has one set of registers for both integer
and floating point arithmetic. Since the PDP-10 has
operation-to-memory instructions, all memory-register state
declarations include two alternate destinations.

rclass REG: a,b,c,d,e,f,g,h,i,j,k,l,m,n
rpath REG->WORD: MOVEM REG,WORD
rpath WORD->REG: MOVE REG,WORD

IMUL m1,m2 (commutative)
from REG(m1),REG(m2) emit IMUL m1,m2 result REG(m1)
from REG(m1),WORD(m2) emit IMUL m1,m2 result REG(m1)
                   or emit IMULM m1,m2 result WORD(m2)

ISUB s1,s2
from REG(s1),REG(s2) emit ISUB s1,s2 result REG(s1)
from REG(s1),WORD(s2) emit ISUB s1,s2 result REG(s1)
                   or emit ISUBM s1,s2 result WORD(s2)

FMUL m1,m2 (commutative)
from REG(m1),REG(m2) emit FMPR m1,m2 result REG(m1)
from REG(m1),WORD(m2) emit FMPR m1,m2 result REG(m1)
                   or emit FMPRM m1,m2 result WORD(m2)

FSUB s1,s2
from REG(s1),REG(s2) emit FSBR s1,s2 result REG(s1)
from REG(s1),WORD(s2) emit FSBR s1,s2 result REG(s1)
                   or emit FSBRM s1,s2 result WORD(s2)



3.8 SUMMARY: THE STATE MACHINE 

The chapter outlines how a code generator performing
computations can be pictured as a state machine. Then it
shows how the state machine can be formalized and
incorporated into DMACS, a system for building machine
independent code generators.

Once the state machine model is incorporated into
DMACS, it becomes a tool that a language implementer can
use. It is a convenient tool since it frees the language
implementer from worrying about machine structure, from
having to perform tests to determine input states to his
macros, and from having to implement transitions to
permitted states. Thus, the macro logic that the language
implementer specifies need only deal with particular
semantic features of his source language. Therefore the
semantics of the source language are logically divorced from
any one target machine's structure. As a result, these
macros become much simpler to write. Also, once these
machine independent macros are written, they can be
implemented for a variety of machines from a machine
description.

CHAPTER IV
DATA REFERENCE MACROS

4.1 INTRODUCTION
4.1.1 OVERVIEW

Chapter 3 described a state machine model which is
built into DMACS and used as a tool to create machine
independent macros which can be filled out from a machine
description. The state machine is useful to help model
computational macros.

Chapter 4 turns to the problem of achieving the same
machine independence for data reference macros. To achieve
this goal, a data definition facility is built into DMACS.



    Source Data          Target Machine
    Declaration          Description
         |                    |
         +---------+----------+
                   |
                   v
                 DMACS
                   |
                   v
          Source Data Described
           For Target Machine

A machine specifier describes his machine memory structure
and describes how source data items are mapped into that
memory. From this description DMACS characterizes source
data items in terms of the primitives of the data definition
facility. The language designer writes his data reference
logic in terms of the primitives of the facility using two
built-in functions, called the INCREMENT and CONVERT
functions. These functions operate on the primitives of the
data definition facility. In effect, these two functions
represent a machine independent model of data reference
logic. A language implementer can write data reference
macros in terms of these built-in functions without
worrying about how the data items of his language map into
the core memory of a particular machine.

Chapter 4 is not an extension of Chapter 3. It
pursues a similar goal in a new area: machine independence
for data reference macros similar to that achieved in
Chapter 3 for computational macros.

4.1.2 DATA REFERENCE 

This section introduces the reader to the term 'data
reference' as used in this chapter, and gives a simple
example of data reference macros in action. The data
reference constructs dealt with in this chapter are
subscripted structures as found in PL/I. PL/I is chosen
both because it is a well known language and also because it
has powerful data referencing constructs. For simplicity
the chapter deals only with structures whose size is static
and known at compile time. This restriction eliminates some
of the messiness of PL/I's structure implementation and lets
us concentrate on the basic problems of making such
references machine independent. If we allow dynamically
varying structure sizes, then we must worry about what logic
can be performed at compile time and what logic must be
performed at run time by generated code. Restricting our
attention to static structures frees us to concentrate more
fully and more clearly on machine independence of data
reference, rather than on the details of implementing
dynamic structures for PL/I. The restrictions still allow
useful and flexible data referencing constructs.

A sample structure is the following: 

declare 1 A (10) fixed,
          2 X,
          2 B (10),
            3 Y (3),
            3 C (10),
            3 Z,
          2 Q (2);

This declaration defines a subscripted structure. The 
[Figure: Sample Structure Layout. Each of the elements A(1) through A(10) contains X, B(1) through B(10), Q(1), and Q(2); each B element contains Y(1) through Y(3), C(1) through C(10), and Z.]

    i      SS     A,I
    i+1    SUBST  i,B
    i+2    SS     i+1,J
    i+3    SUBST  i+2,C
    i+4    SS     i+3,K



This approach can handle any structure reference in a 
simple, general fashion. 

Having discussed the data reference macros to be
dealt with, we now present a simple example of how code
might be generated for the macros outlined above. This
simple example assumes that the items all represent full
words of data on some particular machine. Later we shall
extend this simplified situation to allow more complicated
data items.

Each structure item is characterized by two numbers: 

1. an offset from the beginning of its substructure 
element 

2. an element length 

The structure item B, for instance, has an offset of 1 and
an element length of 14.

In generating code for the macros above, two running
totals can be kept: compile time words (CW) and runtime
words (RW). Together the running totals represent a
displacement into the structure. At the end of the set of
macros, the displacement points to the correct terminal
data item.

The following logic illustrates how the macros might be
expanded. In the interest of clarity and simplicity, the
code generated is not as optimal as it might be.

SS A,I records the offset of A, which is 0, in CW, and
generates code to multiply the element length of A (143)
by I-1. The result of the multiplication becomes RW.
(I-1 is used in the multiplication on the assumption
that the first (zeroth) element is defined as A(1).)

SUBST i,B adds at compile time the offset of B, which is
1, to CW.

SS i+1,J generates code to multiply the element length
of B, 14, by J-1 and add the result to RW.

SUBST i+2,C adds at compile time the offset of C, 3, to
CW.

SS i+3,K generates code to add K-1 to RW. (No
multiplication is necessary since the element length of
C is 1.)

The result of all the computation is a pair of values
(CW, RW) which represent a compile time displacement and a
runtime index pointing to the desired data item. On a
machine like the IBM-360, this pair can be put directly into
a machine instruction (i.e. Load, Add) to access that data
item.
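
The bookkeeping above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table ITEMS, the function name reference, and the symbolic strings used for run time terms are hypothetical and are not part of DMACS. The offsets and element lengths are the full-word values given above for A, B, and C.

```python
# Hypothetical full-word model of the sample structure; names are illustrative.
# item -> (offset within its parent element, element length), both in words.
ITEMS = {"A": (0, 143), "B": (1, 14), "C": (3, 1)}

def reference(path):
    """Return (CW, RW) for a reference such as A(I).B(J).C(K).

    path is a list of (item, subscript) pairs; a subscript is an int when
    known at compile time and a variable name (str) otherwise. CW is an int,
    RW is a list of symbolic run time terms to be summed by generated code.
    """
    cw, rw = 0, []
    for item, sub in path:
        offset, length = ITEMS[item]
        cw += offset                          # SUBST: compile time add
        if isinstance(sub, int):
            cw += (sub - 1) * length          # constant subscript folds into CW
        elif length == 1:
            rw.append(f"({sub}-1)")           # no multiplication needed
        else:
            rw.append(f"{length}*({sub}-1)")  # SS: run time multiply and add
    return cw, rw

print(reference([("A", "I"), ("B", "J"), ("C", "K")]))
# (4, ['143*(I-1)', '14*(J-1)', '(K-1)'])
```

With all three subscripts unknown at compile time, CW is the sum of the offsets (0+1+3) and RW collects the three run time terms, matching the expansion described above.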

The above example illustrates the general operation of
data reference macros. It is shown later that an expanded,
but similarly clean, framework can be used to handle data
more complicated than the full word items of this example.
When data items are bytes and bitstrings, the logic of the
macros is somewhat more complicated, and the eventual result
is not a simple full word pointer, but rather a 'location'
that can be input to a load/update routine which accesses
the data item pointed to.

The most important point to notice in the example is
that each structure item is characterized by an offset and
an element length and that on different machines, these
offsets and lengths might be different. A terminal data
item is also characterized by two additional parameters: a
load/update pair to access the item, and a data length which
need not be the same as the element length. (For instance, an
array of 5-bit bitstrings aligned on word boundaries would
have a data length of 5 bits, but an element length of one
word.) These too could vary for different machines.

4.2 THE DATA DEFINITION FACILITY
4.2.1 DESCRIPTION OF DATA

The previous section examined a simple example of data
reference. This section presents a more precise framework
for describing the type of data with which the chapter is
concerned. A data item can be characterized by a 4-tuple
(OF,EL,DL,LU) describing how it is implemented on a
particular machine.

OF- the offset of the data item from the origin of the 
structure element to which it belongs 

EL- the element length of the structure element which 
that data item defines 

DL- the data length- the length of the piece of data 
which the data item represents 

LU- the load/update pair which accesses the data item. 

OF and EL can characterize any data item. DL and LU apply 
only to terminal data items. 

The following example illustrates how data items
declared in a particular source program might be implemented
differently on two different machines, the IBM-360 and the
PDP-10:

declare 1 A packed, 
2 B fixed, 
2 C char (2), 
2 D char; 

The PDP-10 is a word addressed machine with 36 bits/word.
Assume a data item of type 'fixed' to be defined as a word
item, and a character to be defined as a nine bit item. The
IBM-360 is a byte addressed machine with 8 bits/byte.
Assume a fixed data item to be defined as a word (four byte)
item, and a character to be defined as a byte. Section
4.2.3 describes how such definition is done. Granting these
assumptions, the storage layout for structure A is as
follows:



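[Figure: storage layout of structure A. PDP-10: B occupies one word, followed by C and D packed into the next word. IBM-360: B occupies one word, followed by C and then D, each character occupying one byte.]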

Thus the data item 'A.D' is described as follows on the two
machines:

1. on the PDP-10

OF- 1 word, 18 bits

EL- 9 bits

DL- 9 bits

LU- the load/update routine for bitstrings

(Notice, as an aside, that if one wanted to pack 5
seven-bit characters into a word, then instead of an
element length, this item would have two numbers
associated with it, 36 and 5. Any index into an array of
such characters would be multiplied by 36 and divided by
5 to yield a bit displacement.)

2. on the IBM-360

OF- 6 bytes

EL- 1 byte

DL- 1 byte

LU- the load/update routine for bytes.
4.2.2 DATA DEFINITION

To allow data reference macros to operate over data 
which can be described differently for different machines, 
certain problems must be solved. 

1. Suitable primitives must be found, flexible enough to
describe offsets and lengths of data for a number of
machines.

2. An algorithm must be written which takes a structure 
declaration and a machine description and computes 
offsets and lengths describing that data for that 
machine, expressed in terms of these primitives. 

3. Data reference macros must be written in terms of 
these primitives, so that these macros will be machine 
independent. 

DMACS solves these three problems by using a built-in
data definition facility. The primitives of the data
definition facility are addressable units and bits. All
data is ultimately described in these terms.

The remainder of the chapter first outlines how these
primitives can be deduced from information supplied by a
machine specifier. The chapter then illustrates how DMACS
can help a language implementer write data reference macros
in terms of these primitives.

4.2.3 DEDUCTION OF PRIMITIVES FROM A MACHINE DESCRIPTION 

DMACS characterizes data for any machine in terms of 
addressable units and bits. Information to make this 
characterization must be deduced from the machine 
description which specifies the following: 

1. core memory units: The machine specifier defines his core 
memory units (such as bits, bytes, words, double words, 
etc.), how these map into each other, and which is 
addressable. 

A sample declaration for the IBM 360 follows: 

mem BIT
mem BYTE (8 BIT, addressable)
mem WORD (4 BYTE, boundary 4)
mem DWORD (8 BYTE, boundary 8)

The attribute 'boundary 4' indicates that an
element with storage class WORD has an address
congruent to zero, modulo 4.

2. Source data types: The machine specifier must indicate
which storage unit each source data type is to be mapped
into. It is here that a character data item might be defined
as a byte on the IBM-360 and as a bitstring on the PDP-10.
For a language where data can be packed or aligned, both
storage units are indicated. DMACS uses this information
together with the core memory unit information to determine
the offsets and lengths of data items from a source
program.

map fixed to WORD
map char to BYTE
map bit unaligned to BIT
map bit aligned to BIT align WORD

The last declaration indicates that when a 'bit'
data item has been declared to be 'aligned', it
is to be aligned on a WORD boundary.

3. Load/Update routines: For each of the memory units, the 
machine specifier must define a load/update routine to 
access source data items mapped by the specifier into that 
memory unit. 

Some storage classes may have simple routines:

mpath WORD->REG: L REG,WORD
mpath REG->WORD: ST REG,WORD
mpath BYTE->REG: SR REG,REG; IC REG,BYTE
(etc.)

Other load/update routines are more complicated,
and are discussed in section 4.2.4.
When a language data declaration is processed,
information from such a machine description must be used to
compute offsets and lengths for each data item. The offsets
and lengths can each be described by a 2-tuple (addressable
units, bits). The tuple (4,2), for instance, stands for 4
addressable units and 2 bits.

As a simple example, consider the following
structure:

declare 1 Z,
          2 A fixed,
          2 B bit (12),
          2 C bit (3),
          2 D bit (2) aligned;

The following table indicates how the structure might be
described for the IBM-360 and the PDP-10:

              IBM-360                PDP-10
(data)   (offset)  (length)    (offset)  (length)
A        (0,0)     (4,0)       (0,0)     (1,0)
B        (4,0)     (0,12)      (1,0)     (0,12)
C        (5,4)     (0,3)       (1,12)    (0,3)
D        (8,0)     (0,2)       (2,0)     (0,2)

Each tuple represents addressable units and bits. A is a
fixed data item which is mapped into a full word on both
machines (and hence 4 addressable units on the 360), and B,
C, and D are mapped into bits. The flowchart of a general
algorithm which will take a structure and describe it using
these primitives is given in Figure 4.1.
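
As a rough illustration of how the table entries arise, the following Python sketch lays out the flat structure Z on both machines. It is not the flowchart of Figure 4.1: it handles only a single structure level, and the dictionary MACHINES, the item kinds, and the function name layout are assumptions made for this sketch.

```python
# Hypothetical machine descriptions; field names are assumptions for this sketch.
MACHINES = {
    "IBM-360": {"unit_bits": 8,  "word_units": 4},   # byte addressed, 4-byte word
    "PDP-10":  {"unit_bits": 36, "word_units": 1},   # word addressed
}

# (name, kind, length): kind is 'fixed', 'bit' (unaligned) or 'bit_aligned'.
STRUCT_Z = [("A", "fixed", 1), ("B", "bit", 12), ("C", "bit", 3),
            ("D", "bit_aligned", 2)]

def layout(items, machine):
    """Return {name: (offset, length)}, each a tuple (addressable units, bits)."""
    unit = machine["unit_bits"]
    word = unit * machine["word_units"]        # word size in bits
    disp, table = 0, {}                        # disp: running displacement in bits
    for name, kind, n in items:
        if kind in ("fixed", "bit_aligned"):
            disp = -(-disp // word) * word     # round up to a word boundary
        size = word * n if kind == "fixed" else n
        length = (size // unit, 0) if kind == "fixed" else (0, n)
        table[name] = (divmod(disp, unit), length)
        disp += size
    return table

for name, machine in MACHINES.items():
    print(name, layout(STRUCT_Z, machine))
```

Keeping the running displacement in bits and splitting it with divmod only when an entry is recorded yields the (addressable units, bits) tuples directly; the printed offsets and lengths agree with the table above.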
This algorithm illustrates how offsets and element
lengths can be computed for data items. Two stacks are
used: DISP and STACK.

[Figure: the DISP stack, whose entries have the fields 'addr' and 'bits', and the STACK stack, whose entries hold a name.]

The stacks are pushed each time a new structure level is
encountered, and are popped each time a level ends. Each
entry of DISP has two fields, one for addressable units and
one for bits, which record displacement from the beginning
of the current structure level. STACK is used to store the
name of the current data item at each level. For each data
item an offset (OF) and an element length (EL) are computed.

4.2.4 COMPLEX LOAD/UPDATE ROUTINES

The previous section gave examples of simple
load/update routines for addressable data items.
Load/update routines for non-addressable items (i.e. bit
strings) are more complicated for several reasons.

1. They take as input an address, a bit displacement,
and a bit length.

2. Bit displacements can be runtime or compile time
values.

3. Bitstrings can run across word boundaries.

Two possible solutions are available to handle this
problem. The easiest solution is to require that the
machine specifier provide a subroutine which makes the
appropriate checks, and executes the correct load and shift
instructions for the different situations. The second
solution is to allow the machine specifier to define open
code sequences to be generated, at least for the simpler
cases (for instance, when bit displacement is known at
compile time, and hence it can be determined that the item
does not cross a word boundary).

A sample load routine for the IBM 360 might appear
somewhat as follows:

mpath BIT->REG: L REG,WORD
                SLL REG,DISP
                SRL REG,32-LEN
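
The effect of the shift pair can be checked with a small Python model of a 32-bit register. This is a sketch under assumptions: the function name load_bits is hypothetical, the bit displacement is counted from the high-order end of the word, and only the simple case of a field lying wholly inside one word is handled.

```python
WORD_BITS = 32  # full-word width assumed for the IBM-360 example

def load_bits(word, disp, length):
    """Mimic SLL by disp followed by SRL by 32-length on a 32-bit register.

    disp counts bits from the high-order end of the word; the field must lie
    wholly inside the word (disp + length <= 32).
    """
    if not (length > 0 and disp >= 0 and disp + length <= WORD_BITS):
        raise ValueError("field must lie within a single word")
    shifted = (word << disp) & 0xFFFFFFFF      # SLL: bits shifted out are lost
    return shifted >> (WORD_BITS - length)     # SRL: zero fill from the left

# A 3-bit field 0b101 starting 12 bits into the word.
word = 0b101 << (32 - 12 - 3)
print(load_bits(word, 12, 3))  # 5
```

The left shift discards the bits ahead of the field and the logical right shift discards the bits after it, leaving the field right-justified in the register.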

The whole problem of exactly how to allow a machine
specifier to define open sequences of this sort is a
difficult one. It is to a large degree an implementation
problem for a DMACS builder, rather than a conceptual
problem of machine independence. It is therefore left
somewhat open in this chapter.
4.3 MACHINE INDEPENDENT MACROS
4.3.1 DATA MACRO LOGIC

The previous sections have illustrated how data items
on any machine can be characterized in terms of the
primitives of a data definition facility. This section
describes how machine independent data macros can be written
in terms of the primitives of the facility in a clean,
simple fashion.

As outlined in section 4.1.2, the operation of a data
macro consists of incrementing a pointer into a data base, a
pointer consisting of both runtime and compile time values.
In the machine independent macros which a language
implementer writes, all offsets and lengths are expressed in
addressable units and bits. Thus the pointer being
incremented can be seen as a 4-tuple: (CA, RA, CB, RB).

CA- compile time addressable units
RA- run time addressable units
CB- compile time bits
RB- run time bits

Any element may be nil: i.e. if CA is nil, the pointer has
not been incremented by any compile time addressable units.
The process of incrementing the pointer can be expressed by
the following graph, called the INCREMENT function:

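[Figure: the INCREMENT function. Four pairs of states, one pair each for CA, CB, RA, and RB. In each pair the first transition, out of the state 'nil', is labelled ca/nil, cb/nil, ra/nil, or rb/nil; every later transition is labelled ca/Add-c(CA,ca), cb/Add-c(CB,cb), ra/Add-r(RA,ra), or rb/Add-r(RB,rb).]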

This graph contains four pairs of states. Each pair is a
state machine recording the presence or absence of one
element of the 4-tuple. Hence, if the pointer has no
runtime bits, the pair for RB is in the state 'nil'. Each
pair starts in the state 'nil'. Input is represented by
'ca', 'cb', 'ra', and 'rb'. Actions to be taken are either
Add-c, representing addition at compile time, or Add-r,
representing addition at run time, or nil. When the pointer
is incremented by a new value, the appropriate state machine
makes a transition. If this is the first transition for
that machine, then there is a change of state with no action
performed. If this is not the first transition, then either
a compile time 'Add-c' is performed, or code is generated
to perform a run time 'Add-r'.

The graph is a machine independent model of the 
operation of a data reference macro. It is machine 
independent because it operates on the primitives of a data 
definition facility in which data for a variety of machines 
can be automatically expressed. 
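
The four state machines can be modelled compactly in Python. This is a minimal sketch, assuming that nil is represented by None, that run time components hold the name of a run time value, and that generated code is recorded as pseudo-instruction strings; the class name Pointer and the ADD text are hypothetical.

```python
class Pointer:
    """Pointer 4-tuple (CA, RA, CB, RB); None stands for nil.

    Compile time components hold integers. Run time components hold the name
    of a run time value, and additions to them are emitted into self.code as
    illustrative pseudo-instructions (the ADD text is an assumption).
    """
    def __init__(self):
        self.CA = self.RA = self.CB = self.RB = None
        self.code = []

    def increment(self, component, value):
        current = getattr(self, component)
        if current is None:                      # first transition: no action
            setattr(self, component, value)
        elif component in ("CA", "CB"):          # Add-c: add at compile time
            setattr(self, component, current + value)
        else:                                    # Add-r: generate run time add
            self.code.append(f"ADD {current},{value}")

p = Pointer()
p.increment("CA", 4)
p.increment("CA", 3)
p.increment("RA", "t1")
p.increment("RA", "t2")
print(p.CA, p.RA, p.code)  # 7 t1 ['ADD t1,t2']
```

The first increment of each component only changes state; later increments either fold into the compile time value or emit an addition, as the transitions of the graph prescribe.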

Since the INCREMENT function is machine independent,
it is built into DMACS. Using this function, the language
designer can write his macros without worrying about how
different data items map into core. In a similar manner, a
CONVERT function to convert the pointer into a data item
'location' (to be input to a load/update routine) is built
into DMACS. The logic for this routine is discussed in the
next section.

Using these two built-in routines, a language
designer can write a subscript macro with the following
logic:

SUBSCRIPT X,I

1. subtract 1 from I yielding Value(I-1).

2. if the element length of X is (1,0) or (0,1)
   then INCREMENT X by Value(I-1)
   else multiply Value(I-1) by the element length of X
   and INCREMENT X by the result

3. if X is a terminal data item
   then apply CONVERT to the pointer computed above.

(When incrementing X with a value, the units of the element
length of X determine which component of the pointer is
incremented. If I is a compile time value, the subtraction
and multiplication can be done at compile time. Otherwise
code must be generated to perform these operations at run
time.)
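
Steps 1 and 2 of the macro might be sketched as follows. Several things here are assumptions rather than part of DMACS: the pointer is a plain dictionary with None for nil, the element length has exactly one non-zero part, generated code is recorded as strings using hypothetical temporaries t0, t1, and so on, and step 3 (CONVERT) is left out.

```python
def subscript(pointer, code, el, index):
    """Apply SUBSCRIPT to a pointer dict with keys CA, RA, CB, RB (None = nil).

    el is the element length (addressable units, bits) with exactly one
    non-zero part; index is an int (compile time) or a name (run time).
    """
    in_units = el[0] != 0
    scale = el[0] if in_units else el[1]
    if isinstance(index, int):                       # all done at compile time
        key, value = ("CA" if in_units else "CB"), (index - 1) * scale
    else:                                            # code generated for run time
        key, value = ("RA" if in_units else "RB"), f"t{len(code)}"
        code.append(f"{value} = {index}-1" if scale == 1
                    else f"{value} = ({index}-1)*{scale}")
    # INCREMENT: the first use of a component just records the value.
    if pointer[key] is None:
        pointer[key] = value
    elif isinstance(value, int):
        pointer[key] += value
    else:
        code.append(f"{pointer[key]} = {pointer[key]}+{value}")
    return pointer

p = {"CA": None, "RA": None, "CB": None, "RB": None}
code = []
subscript(p, code, (4, 0), 3)      # constant subscript, 4-unit elements
subscript(p, code, (0, 9), "J")    # run time subscript, 9-bit elements
subscript(p, code, (0, 1), "K")    # run time subscript, 1-bit elements
print(p, code)
```

Nothing in the function refers to a particular machine; the element length passed in decides whether the addressable or the bit component of the pointer is touched.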



The macro is completely free of machine dependent
detail. Using the two functions built into DMACS, the
language implementer is able to write a macro dealing only
with the semantics of his source language. For instance,
such a macro might include logic to handle subscript bounds
or to handle special types of subscripting such as for
triangular matrices, but need pay no concern to machine
structure at all.

4.3.2 THE CONVERT FUNCTION

The CONVERT function takes a pointer in the form of
a 4-tuple (CA, RA, CB, RB) as discussed in the previous
section, and converts it into a form suitable for use by a
load/update routine. When the pointer is expressed in
addressable units and references a simply accessible item,
conversion is not necessary. When the pointer includes
bits, however, the bit elements of the pointer must be
normalized to yield a number of addressable units, and a
local bit displacement within the memory unit pointed to by
the addressable elements RA and CA.

First we discuss the problem of normalization, and
then how it fits into the CONVERT logic as a whole.

NORMALIZATION: Consider the problem of accessing a
bitstring on the PDP-10 and on the IBM-360 given the base
address of a data area and a bit index into it. On the
PDP-10, the index should be divided by 36 (bits/word),
yielding a full-word index as the quotient, and the bit
displacement as remainder. On the 360, assuming the
load/update routine uses full word load instructions, the
address of a full-word boundary is wanted, together with a
bit displacement to within that word. Therefore, the index
should be divided by 32 (bits/word), yielding a bit
displacement as remainder. Multiplying the quotient by 4
would then yield an index in addressable units. Thus a data
type may have the following attributes when implemented on a
particular machine: Nd- a number to divide a bit pointer by,
to yield a 'local' bit pointer as a remainder; Ma- a number
to multiply the result of that division by to yield
addressable units.

When the bit pointer is a compile time value, this
normalization is performed at compile time. Otherwise, code
must be generated to perform the normalization at run time.
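
For a compile time bit pointer, normalization is a single division with remainder. The sketch below uses Nd = 36 for the PDP-10 and Nd = 32, Ma = 4 for the IBM-360 as given above; Ma = 1 for the PDP-10 is an inference (its quotient is already a word index), and the table NORMALIZE and the function name normalize are hypothetical.

```python
# (Nd, Ma) per machine. The PDP-10 value Ma = 1 is inferred, not stated.
NORMALIZE = {"PDP-10": (36, 1), "IBM-360": (32, 4)}

def normalize(bit_index, machine):
    """Split a bit index into (addressable units, local bit displacement)."""
    nd, ma = NORMALIZE[machine]
    quotient, local_bits = divmod(bit_index, nd)
    return quotient * ma, local_bits

print(normalize(100, "PDP-10"))   # (2, 28)
print(normalize(100, "IBM-360"))  # (12, 4)
```

A bit index of 100 falls 28 bits into the third word on the PDP-10, and 4 bits into the word at byte address 12 on the IBM-360.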



CONVERT: The convert function takes a pointer
P=(CA,RA,CB,RB) and converts it to a location L=(CA,RA,cbl)
or L=(CA,RA,rbl), where cbl and rbl are local bit
displacements into a memory unit. The logic of this
function is:

1. If CB≠nil and RB=nil, then normalize CB at compile
time yielding ca and cbl. Then INCREMENT P by ca.

2. If RB≠nil then do

( a. if CB≠nil then
(generate code to add RB and CB yielding RB)

b. generate code to normalize RB yielding ra and
rbl

c. INCREMENT P with ra )

This function yields an expression which can be input
to a load/update pair. This function operates on the
primitives of a data definition facility and is therefore
machine independent.
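
The two cases of CONVERT can be put together in a short Python sketch. It is an illustration under assumptions, not the DMACS implementation: the pointer is a dictionary with None for nil, compile time parts are integers, run time parts are names of run time values, and the generated code is a list of pseudo-code strings whose temporaries q, ra, and rbl are hypothetical.

```python
def convert(p, nd, ma):
    """Turn pointer p = dict(CA, RA, CB, RB) into a location (CA, RA, bits).

    None stands for nil; compile time parts are ints and run time parts are
    names of run time values. nd and ma are the normalization constants Nd
    and Ma. Returns the location and the list of generated pseudo-code lines.
    """
    ca, ra, cb, rb = p["CA"], p["RA"], p["CB"], p["RB"]
    code = []
    if rb is None and cb is not None:           # step 1: all at compile time
        q, cbl = divmod(cb, nd)
        ca = (ca or 0) + q * ma                 # INCREMENT P by ca
        return (ca, ra, cbl), code
    if rb is not None:                          # step 2: run time normalization
        if cb is not None:
            code.append(f"{rb} = {rb}+{cb}")    # fold CB into RB
        code.append(f"q = {rb}/{nd}; rbl = {rb} mod {nd}")
        code.append(f"ra = q*{ma}")
        if ra is None:
            ra = "ra"                           # first run time addressable part
        else:
            code.append(f"{ra} = {ra}+ra")      # INCREMENT P with ra
        return (ca, ra, "rbl"), code
    return (ca, ra, None), code                 # no bits: nothing to convert

print(convert({"CA": 4, "RA": None, "CB": 44, "RB": None}, 32, 4))
# ((8, None, 12), [])
```

In the compile time case no code is generated at all: 44 bits past byte 4 on the IBM-360 becomes byte 8 with a local displacement of 12 bits.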

4.4 SUMMARY

The chapter describes how a data definition facility
is built into DMACS to facilitate the writing of machine
independent macros. Then it discusses how this facility is
used: how the machine specifier describes his machine
memory, accessing functions, and the mapping of source data
types into core; and how DMACS then uses the information to
compute the primitives which describe a source program's
data. The chapter then discusses how machine independent
macros are written in terms of two machine independent
functions (INCREMENT and CONVERT), operating over these
primitives. These two functions embody the substance of the
machine related part of data reference macros. They are
built into DMACS to be used as a tool by the language
implementer.

The basic concept set forth in this chapter is the
use of a data definitional facility. The rest of the
chapter is built around this idea. It is instructive to ask
how much more flexibility the definitional facility affords
over a code generator for a single target machine. At first
glance, it might appear that the definitional facility
merely lets DMACS describe data with different numbers on
different machines, but perform the same manipulations with
those numbers in all cases. This is not true. The
definitional facility gives the language implementer the
ability to handle a given source data reference with
different sections of his logic on different machines. Thus
an array of characters can be handled for the IBM-360 as an
array of addressable units with element length of 1, whereas
on the PDP-10, it would be handled as an array of bitstrings
of length 9. The operations performed on these element
lengths could be different, and the load/update routines
used to access the items could be different.

Thus the definitional facility of DMACS provides a 
flexible interface between machine structure and macro 
logic. At the same time, it is an interface that is almost 
invisible to both the machine specifier and the language 
implementer. The language implementer is able to think 
primarily in terms of the semantics of his language 
irrespective of machine structure, and the machine 
specifier merely gives a description of his machine. DMACS
takes care of binding the macros and the machine description 
together. 
CHAPTER V
CONCLUSIONS AND FURTHER WORK

5.1 AREAS FOR FURTHER WORK
5.1.1 FURTHER ASPECTS OF CODE GENERATION

The scope of the present research is limited since it
does not address the task of making an entire compiler
machine independent. Only two classes of macros are
studied, and only a limited set of possible operand types
is allowed. Also, many machine idiosyncrasies, such as
interrupt handling, are ignored.

The problem of making a powerful compiler machine
independent is a difficult and a messy one. The problem is
somewhat softened by the fact that many machine
idiosyncrasies can properly be handled by subroutines, and
thus may not prove to be insurmountable stumbling blocks.

One significant area not dealt with is the class of
control macros, such as subroutine calls, entry and return
macros, etc. These macros may not, however, require any
elaborate mechanisms to allow machine independence. In
general, such control macros are implemented very similarly
on different machines and may be describable merely by
appropriate code sequences. One minor problem is to assure
that the stack is allocated in the correct units.

Types of operands not considered include character
strings and decimal operands such as those found on the
IBM-360. Both of these types of operands are generally not
manipulated via registers, but rather by subroutine or by
special memory-memory instructions. The model of
computation in Chapter 3 is oriented primarily towards
manipulating values using registers. More work is also
needed to determine exactly how load/update routines can
best be defined to fit into a machine independent
framework.

5.1.2 EXTENDING THE MODELS 

The models presented in this paper are set forth
primarily to isolate some basic ideas involved in code
generation, and to provide a basis for more general
extensions which could include a broader spectrum of machine
structure.

In particular, one might relax some of the
constraints imposed on register structure in Chapter 3
(perhaps to include such machines as a stack machine), and
develop an automatic mechanism for attaining permitted
states in this less constrained system. By relaxing
constraints in this fashion, it might be possible to obtain
a number of different automatic mechanisms, together with
classes of machine structures which can be handled by each
mechanism.

In a similar vein, one might consider different
possible addressing structures, and determine how the
machine independent data reference logic can be modified to
accommodate them. In particular, it might be useful to look
at addressing on small machines, such as the PDP-8, which
tend to have anomalous addressing strategies due to
bit-conserving design considerations. In fact, such machines
might be practical candidates for a descriptive system like
DMACS, since they tend to be reasonably similar, and since
they tend to be unsuitable for sustaining compilers
themselves.

5.2 SUMMARY OF RESULTS 

The present research has examined the two most common
types of macro used for handling arithmetic values:
computation macros, and data reference macros. For each of
the two types of macro, the paper develops a machine
independent formalism which models the machine dependent
aspects of the macro's logic: a state machine for
computation macros; and the INCREMENT and CONVERT functions
for data reference macros.

Chapters 3 and 4 show how the models can be
incorporated into DMACS, a descriptive macro system. A
language implementer can use the models as tools, writing
his macros in terms of machine independent primitives which
invoke the model. A machine specifier can then describe his
machine, and descriptively fill out the primitives as they
apply to his machine.

Thus the research has several purposes: 

1. The research is a first attempt to formalize some of the
logic involved in generating code for high level languages.

2. The research is an attempt to see what is involved in
attaining machine independence in a code generator, similar
to the language independence and the token independence
achieved by automatic parsing and automatic lexical
systems.

3. Towards this end, this paper explores the question of
just what might reasonably constitute a 'description' of a
machine.

4. The research helps make clearer the distinction between
the semantics of a high level language and the structure of
a target machine, a distinction that is often unclear in a
compiler oriented towards a single machine.